Linked Open Data for Environment Protection in Smart Regions
Abstract
Many open information sources currently exist for protecting the environment in Europe, mainly focused on the Natura 2000 network and on areas where environmental protection and activities such as tourism need to be balanced. Managing and integrating these data to support decision makers and enable novel uses is a challenging task. The SmartOpenData project (2013-2015) aims to define mechanisms for acquiring, adapting and using Open Data provided by existing sources for environment protection in European protected areas. Through targeted pilots in these areas, the project will harmonise metadata, improve spatial data fusion and visualisation, and publish the resulting information according to user requirements and Linked Open Data principles to provide new opportunities for use. SmartOpenData will build on the experience of the Habitats project, which defined models and tools for managing spatial data in environmental protection areas. This paper introduces SmartOpenData, with a specific focus on the motivation, goals and technical focus of the project, and outlines the architecture of the approach taken.

1 SmartOpenData Overview

Many different open data sources exist for protecting biodiversity and supporting environmental research in Europe, in coastal zones, agricultural areas, forestry, etc. They are mainly focused on the Natura 2000 network [1] and on areas where environmental protection and activities such as agriculture, forestry or tourism need to be balanced with the Habitats Directive [2] and the European Charter for Sustainable Tourism in Protected Areas [3]. Better understanding and management of these data can not only provide economic value for these areas (a value that is currently largely unknown), but will also enable organizations to develop new services based on these data and open up new possibilities for public bodies and for rural and protected areas to benefit from using data in novel ways, improving their knowledge and environment protection through new innovation ecosystems.

In this context, the SmartOpenData project [4] has set its goals to:
- Create a sustainable Linked Open Data [5] infrastructure in order to promote the sharing of environmental protection data among public bodies in the European Union;
- Enhance Linked Open Data with semantic support by integrating semantic technologies built upon connected Linked Open Data catalogues, with the aim of building sustainable, profitable and standardised environment protection and climate change surveillance services;
- Define business models, especially focused on SMEs and based on innovative services, as new opportunities to align research results, previous work and projects, tackling the active involvement of the whole value chain in Smart Regions at policy, industry and society levels;
- Demonstrate the impact of sharing and exploiting data and information from many varied resources in rural and European protected areas, by providing public access to the data and developing demonstrators that show how services can provide high-quality results in regional development when working with semantically integrated resources.

* Authors listed in alphabetical order. Contact author: Dumitru Roman.
[1] http://www.natura.org/
[2] http://ec.europa.eu/environment/nature/legislation/habitatsdirective/index_en.htm
[3] http://www.european-charter.org/home/

2 Building upon Previous Results: The Habitats Project

The Habitats project [6] was built as an environment that makes it possible to share and combine data from various sources. The project results were validated through the Habitats Reference Laboratory and pilot applications.
On the basis of different pilots, Habitats defined and tested harmonisation rules for spatial environmental data and designed the concept of the Reference Laboratory as a tool for testing interoperability and supporting the unification of outputs across pilots. The challenges faced by Habitats were mainly due to data availability, integration and usability for decision-making, in particular in terms of its focus on Metadata, Data Specifications, Network Services, Data and Service Sharing, and Monitoring and Reporting. Habitats was designed to support the EU INSPIRE Directive. The specific usage scenarios, including the state-of-the-art baseline and the user requirements derived from them, represented the key input for the data and metadata modelling activities and for the SDI services that were developed in the Habitats project.

Generally speaking, a positive correlation between service development and user satisfaction was detected in all the pilots. On the other hand, it cannot be taken for granted that the new services provided are also INSPIRE compliant. This can be due to several reasons, two of which seem more prominent than others:
- On the "supply side", the cost of increasing compliance, in terms of time, resources, etc., from the perspective of the SDI "owner";
- On the "demand side", lack of interest in, or simply unawareness of, the advantages of compliance, from the perspective of the end users.

[4] SmartOpenData: "Linked Open Data for environment protection in Smart Regions", under the call FP7-ENV-2013-twostage of the Seventh Framework Programme (FP7) (2013-2015).
[5] http://linkeddata.org/
[6] http://www.inspiredhabitats.eu/

3 The SmartOpenData Approach

Linked Open Data is emerging as a source of unprecedented visibility for environmental data that will enable the generation of new businesses as well as a significant advance for research in the environmental area.
For this envisioned strategy to become a reality, it is necessary to advance the publication of existing environmental data, most of which is owned by public bodies. How Linked Open Data can be applied generally to spatial data resources and specifically to public open data portals, GEOSS Data-CORE, GMES, INSPIRE and voluntary data (OpenStreetMap, GEP-WIKI, etc.), and how it can impact economic and sustainability progress in European environmental research and biodiversity protection, are open questions that need to be addressed in order to benefit from an improved understanding and management of environmental data. The SmartOpenData project (2013-2015) will address these questions by defining mechanisms for acquiring, adapting and using Open Data, with a particular focus on biodiversity and environment protection in rural and European protected areas and their National Parks.

The vision of the SmartOpenData project is that environmental and geospatial data concerning rural and protected areas can be made more readily available and re-usable, and better linked with data that have no direct geospatial reference, so that different distributed data sources can be easily combined. SmartOpenData will use the power of Linked Open Data to foster innovation within the rural economy and increase efficiency in the management of the countryside. The project will prove this in a variety of pilot programmes in different parts of Europe.

The goal of SmartOpenData is to make the INSPIRE/GMES/GEOSS infrastructure more readily available to citizens, as well as to public and private organizations. On the one hand, Europe and the EU are investing hundreds of millions of euros in building the INSPIRE infrastructure. On the other hand, public and private organizations, as well as citizens, use Google Maps for their applications. National and regional SDIs offer information that is not available on Google, but this potential is not used.
One of the main goals of SmartOpenData is to make European spatial data easily re-usable not only by GIS experts but also by various organizations and individuals at a larger scale. To realize this, on a technical level, the project will:
- Harmonise geospatial metadata (ISO 19115/19119 based) with the principles of the Semantic Web;
- Provide spatial data fusion by introducing the principles of Linked Open Data;
- Improve spatial data visualisation of geospatial Linked Open Data;
- Publish the resulting information according to user requirements and Linked Open Data principles.

In the context of the SmartOpenData project, using linked data for spatial data means identifying possibilities for establishing semantic connections between INSPIRE/GMES/GEOSS and spatially related Linked Open Data content in order to generate added value. The project requirements lie within the environmental research domain. This will be achieved by making existing relevant "INSPIRE-based" spatial data sets, services and appropriate metadata available through a new Linked Data structure. In addition, the proposed infrastructure will provide automatic search engines that crawl additional available geospatial resources (OGC and RDF structures) across the deep and surface web.

The main motivation for utilising the potential of Linked Data is to enrich the INSPIRE spatial content so that improved related services can be offered and the number, performance and functionality of applications can be increased. In many cases, querying data in an INSPIRE (GEOSS) based data infrastructure (driven mainly by relational databases) is time-consuming, and the results are often insufficient or hard to understand for common Web users. In large databases such queries can take minutes or hours. In the case of distributed databases, such a query is almost impossible or very complicated. SmartOpenData aims to improve this situation dramatically.
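As a rough illustration of the first technical step listed above, harmonising ISO 19115-style metadata with Semantic Web principles, the sketch below maps a flat metadata record onto Dublin Core terms and serialises it as N-Triples. The record fields, the example.org URI and the field-to-property mapping are assumptions made for illustration; they are not taken from the SmartOpenData specifications.

```python
# Hypothetical mapping from ISO 19115-style field names to Dublin Core terms.
DCT = "http://purl.org/dc/terms/"
FIELD_TO_PROPERTY = {
    "title": DCT + "title",
    "abstract": DCT + "description",
    "keywords": DCT + "subject",
}


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(dataset_uri, record):
    """Turn a flat metadata record into a list of N-Triples lines."""
    lines = []
    for field, values in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # fields without a mapping are skipped in this sketch
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(f'<{dataset_uri}> <{prop}> "{escape_literal(value)}" .')
    return lines


# Hypothetical record and dataset URI, for illustration only.
record = {
    "title": "Protected area boundaries",
    "abstract": "Boundaries of a protected area.",
    "keywords": ["biodiversity", "protected areas"],
    "lineage": "not mapped in this sketch",
}
for line in record_to_ntriples("http://example.org/dataset/1", record):
    print(line)
```

A real harmonisation would also need to handle nested ISO 19115 elements, controlled vocabularies and typed values; the flat dictionary is used here only to keep the idea visible.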
The most advanced technical effort to reconcile the Linked Data and geospatial data worlds is embodied by OGC's GeoSPARQL standard [7]. This merges the two technologies, with the GeoSPARQL engine translating queries back and forth between RDF and geospatial engines. The number of GeoSPARQL implementations [8] is growing, but there remains some debate as to whether it is the best approach. The NeoGeo vocabulary [9] is favoured by the French mapping agency IGN and handles geospatial data differently, by linking to it from the RDF rather than transporting large literals. The INSPIRE standards have been developed entirely in an XML-centric manner, and the European Commission's JRC is now working on making better use of linked data. This is being done in a W3C Community Group focussing on locations and addresses [10]. A related, but separate, Community Group [11] is also considering better interplay between Web and geospatial technologies. What these activities all suggest is that there is work to be done to allow geospatial and linked data specialists to communicate easily, avoiding the so-called religious wars. SmartOpenData brings together specialists in both disciplines: RDF to describe a location or point of interest, GI to define where it is on the Earth's surface.

Another important problem to be addressed in the context of SmartOpenData is multilingualism. The problem of translating geographical data and metadata has not yet been solved within INSPIRE or GEOSS. This hampers the use of global data by local communities and of local data by foreigners. Translation of geographical data is a big challenge for everyone within the SDI community, and its importance will grow as SDIs grow. The implementation of RDF should help ease the translation of geographic names or keywords from vocabularies like GEMET or AgroVoc.
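To make the two ideas above more concrete, the following sketch describes one feature with language-tagged labels (the RDF side) and a WKT geometry literal in the GeoSPARQL vocabulary (the GI side), and then picks a label for a user's preferred languages. The feature URI, the coordinates and the labels are hypothetical and are not drawn from GEMET, AgroVoc or any project dataset; labels are assumed to contain no characters that need escaping.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def feature_to_ntriples(uri, labels, wkt):
    """Serialise labels as language-tagged literals and the geometry as a WKT literal."""
    lines = [
        f'<{uri}> <{SKOS}prefLabel> "{text}"@{lang} .'
        for lang, text in sorted(labels.items())
    ]
    lines.append(f"<{uri}> <{GEO}hasGeometry> <{uri}/geometry> .")
    lines.append(f'<{uri}/geometry> <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .')
    return lines


def pick_label(labels, preferred, default="en"):
    """Return the first label matching the preferred languages, then the default."""
    for lang in list(preferred) + [default]:
        if lang in labels:
            return labels[lang]
    # No preferred or default language available: fall back deterministically.
    return labels[sorted(labels)[0]]


# Hypothetical labels for one keyword-like concept.
labels = {"en": "forest", "es": "bosque", "cs": "les"}
for line in feature_to_ntriples("http://example.org/feature/1", labels, "POINT(13.5 49.0)"):
    print(line)
print(pick_label(labels, ["cs", "de"]))
```

Because each label carries its own language tag, a client can choose a translation at display time without any change to the geometry or the identifier, which is the property the paragraph above relies on.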
The research focus of SmartOpenData will be on how to use existing GI data within an RDF framework or, from the other direction, how existing GI data can be accessed as part of linked data. To achieve this, new algorithms will be developed that expose the wealth of environmental data as linked data. This may require human intervention in some cases, but such intervention will be minimised with a view to making the process repeatable and scalable. For example, the OpenRefine tool [12] allows the same operation to be carried out on tabular datasets of unlimited size and is likely to be useful in this task, perhaps supported by a SmartOpenData reconciliation API. In a linked data environment, the definition of points, lines and polygons remains untouched, but the relationships between features, the names of places and, in particular, the identifiers are handled differently. Separating those elements out and encoding them as linked data, and doing so at scale, will be a significant challenge.

Creating the data as RDF and storing it in dedicated triple stores is only the first step, however. More difficult is the discovery of links to data already available in the linked open data cloud, such as GeoSpecies [13]. The example given on the GeoSpecies Web site shows details of the cougar, including links to where it can be expected to be found. It is the links between datasets that make linked open data so powerful, and forging those links is an essential aspect of realising the objectives of SmartOpenData.

Turning to translation, there are two principal approaches to machine translation (MT): rule-based and statistical. Current state-of-the-art MT technology is based on the statistical MT (SMT) paradigm, which assumes that the application data match the training data used during the learning phase to extract and generalise the parameters of the system.
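The assumption that application data match training data can be checked crudely before any translation is attempted. The sketch below computes an out-of-vocabulary rate, i.e. the share of domain tokens never seen in the training text. This measure is an illustrative proxy chosen here, not a method described by the project, and the two toy texts are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def oov_rate(training_text, domain_text):
    """Return the fraction of domain tokens that never occur in the training text."""
    vocabulary = set(tokenize(training_text))
    tokens = tokenize(domain_text)
    if not tokens:
        return 0.0
    unseen = sum(1 for token in tokens if token not in vocabulary)
    return unseen / len(tokens)


# Toy texts invented for illustration.
training = "the council adopted the directive on the internal market"
domain = "the directive protects wetland habitats"
print(oov_rate(training, domain))
```

In the toy example three of the five domain tokens are unseen, which hints at why environmental and geographical text may need dedicated adaptation.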
Combined methods are also currently being investigated, bringing together the linguistic and translation knowledge accumulated over the last 40 years with the SMT systems deployed today. For SMT systems, the more distant the actual data are from the data used for training, the worse the results. As we are concerned with environmental and geographical data, we will explore resource-limited adaptation to those domains in the context of SmartOpenData.

Another area of research for the project will be the handling of large volumes of real-time data. This puts a strain on the infrastructure, so methods to reduce that strain will need to be researched, possibly using the W3C POWDER technology [14] as a data compression tool. Tracking the provenance of any data is of course important, but as yet there is no standardised linkage within the Semantic Web technology stack between Provenance [15] and SPARQL Update.

[7] http://www.opengeospatial.org/standards/geosparql
[8] http://geosparql.org, https://twitter.com/marin_dim/status/271573164268609536, http://www.strabon.di.uoa.gr
[9] http://geovocab.org/doc/neogeo
[10] http://www.w3.org/community/locadd
[11] http://www.w3.org/community/geosemweb
[12] http://openrefine.org (formerly Google Refine)
[13] http://datahub.io/dataset/geospecies/resource/47e71c4c-9565-4185-b8c0-bdef6449278e
[14] http://www.w3.org/standards/techs/powder#w3c_all
[15] http://www.w3.org/standards/techs/provenance#w3c_all

4 The SmartOpenData Architecture

The SmartOpenData infrastructure is depicted in Figure 1, where three main elements can be identified.

Figure 1. SmartOpenData Infrastructure

The lower level depicts the external data sources, which can be grouped into two sets. The first is composed of data sources that comply with some of the standards supported by SmartOpenData (green boxes). The second is composed of data sources that do not comply with those standards (blue boxes). In the upper layer, three different scenarios have been identified: a scenario for researchers, a scenario for companies and a scenario for end users. Each scenario will focus on one specific segment, using the functionalities provided by the SmartOpenData System to create services that take advantage of such data and provide value for each community, illustrating how the availability of these services and the corresponding data can benefit them.

The SmartOpenData System sits between the external data sources and the data consumers in the scenarios, providing key functionalities. Its most basic element is the harmonisation of data sources. This element offers an open data source layer that exposes the external data sources fully adapted to the open data standards supported by the project. If an external data source does not provide its information according to the required standard, an adaptation is required; this is depicted in the figure as an extra box, which provides an adaptation specifically tuned to that external data source. The open data source layer provides both semantic information about the data and the data themselves. On top of this open data source layer, three key functionalities are defined:
- Distributed semantic indexing, which provides a service for searching and locating data based on semantic information collected from all the available data sources;
- Distributed data access, which provides data collected from external data sources as an extra data source, for easier and uniform data gathering by the users in the identified scenarios;
- Administration and notification, which provides data providers with administration facilities for managing users, workflows and data.
These three functional components are coordinated inside the SmartOpenData System, creating a distributed service system which can be accessed transparently from the scenarios. It is also important to note that services created in the scenarios will be able to directly access external data sources selected through the distributed semantic indexing functionality of the SmartOpenData System, provided that those sources are offered using one of the supported standards, as shown in the figure.
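The layering described in this section can be sketched in a few lines: compliant sources are used directly, non-compliant ones are wrapped by a source-specific adapter, and an index built over the harmonised layer answers searches. All class names, record fields and sample data below are hypothetical, and a plain keyword lookup stands in for the semantic indexing the project envisages.

```python
class StandardSource:
    """A source that already exposes records in the harmonised form."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def records(self):
        return list(self._records)


class AdaptedSource:
    """Wraps a non-compliant source with an adapter tuned to that source."""

    def __init__(self, name, raw_rows, adapter):
        self.name = name
        self._raw_rows = raw_rows
        self._adapter = adapter

    def records(self):
        return [self._adapter(row) for row in self._raw_rows]


class KeywordIndex:
    """Maps keywords to (source name, record id) pairs across all sources."""

    def __init__(self, sources):
        self._index = {}
        for source in sources:
            for record in source.records():
                for keyword in record["keywords"]:
                    entry = (source.name, record["id"])
                    self._index.setdefault(keyword.lower(), []).append(entry)

    def search(self, keyword):
        return self._index.get(keyword.lower(), [])


def adapt_csv_row(row):
    """Hypothetical adapter: the raw source uses 'ID' and a ;-separated 'tags' field."""
    return {"id": row["ID"], "keywords": row["tags"].split(";")}


sources = [
    StandardSource("park-a", [{"id": "a1", "keywords": ["wetland", "birds"]}]),
    AdaptedSource("park-b", [{"ID": "b7", "tags": "Wetland;forest"}], adapt_csv_row),
]
index = KeywordIndex(sources)
print(index.search("wetland"))
```

Keeping the adapter as a separate callable per source mirrors the "extra box" in the figure: adding a new non-compliant source means writing one adapter, while the index and the consumers stay unchanged.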